.TH BIBTEXU "1" "30 August 2022" "bibtexu 4.00" "User Commands"
.SH NAME
bibtexu \- UTF-8 Big BibTeX
.SH SYNOPSIS
.B bibtexu
[\fIoptions\fR] \fIaux-file\fR
.SH DESCRIPTION
.PP
.B BibTeXu
is the Unicode-compliant version of BibTeX.
It is largely based on Niel Kempson's BibTeX8, and it provides
a better support for UTF-8 by integrating ICU library.  Therefore,
.B BibTeXu
no longer requires the Codepage and Sort order ("CS") file; instead,
the method of sorting and case-changing can be controlled via
command-line options.
.SH OPTIONS
.TP
\fB\-?\fR  \fB\-\-help\fR
display some brief help text.
.TP
\fB\-d\fR  \fB\-\-debug\fR TYPE
report debugging information.  TYPE is one
or more of all, csf, io, mem, misc, search.
.TP
\fB\-s\fR  \fB\-\-statistics\fR
report internal statistics.
.TP
\fB\-t\fR  \fB\-\-trace\fR
report execution tracing.
.TP
\fB\-v\fR  \fB\-\-version\fR
report BibTeX version.
.TP
\fB\-l\fR  \fB\-\-language\fR LANG
use language LANG to convert strings to low case.
This argument is passed to ICU library.
.TP
\fB\-o\fR  \fB\-\-location\fR LANG
use language LANG for sorting.
This argument is passed to ICU library.
.TP
\fB\-B\fR  \fB\-\-big\fR
set large BibTeX capacity.
.TP
\fB\-H\fR  \fB\-\-huge\fR
set huge BibTeX capacity.
.TP
\fB\-W\fR  \fB\-\-wolfgang\fR
set really huge BibTeX capacity for Wolfgang.
.TP
\fB\-M\fR  \fB\-\-min_crossrefs\fR ##
set min_crossrefs to ##.
.TP
\fB\-\-mstrings\fR ##
allow ## unique strings.
.SH UNICODE SUPPORT
.PP
.B BibTeXu
supports extended features to handle Unicode characters.
Several built-in functions in bibliography styles are enhanced as follows.
.TP
\fB&\fR
Pops the top two (integer) literals and pushes their bitwise AND.
.TP
\fB|\fR
Pops the top two (integer) literals and pushes their bitwise OR.
.TP
\fBadd.period$\fR
Pops the top (string) literal, adds a `.' to it if the last non`}'
character isn't a `.', `?', `!' or a Unicode punctuation mark and pushes this resulting string.
The mark may be U+203C, U+203D, U+2047, U+2048, U+2049, U+3002, U+FF01, U+FF0E or U+FF1F.
.TP
\fBchr.to.int$\fR
Pops the top (string) literal, makes sure it's a multibyte string of a single Unicode code point,
converts it to the corresponding Unicode scalar value (integer), and pushes this integer.
.TP
\fBint.to.chr$\fR
Pops the top (integer) literal, interpreted as the Unicode scalar value of a single code point,
converts it to the corresponding single character multibyte string, and pushes this string.
.TP
\fBnum.names$\fR, \fBformat.name$\fR
The function is the same as original BibTeX but
an Ideographic/Fullwidth Comma (U+3001, U+FF0C) in addition to an " and " string is
accepted as a separator between persons and
Ideographic Space (U+3000) in addition to a space " " is accepted as a separator between a family name and a given name.
.TP
\fBsubstring$\fR, \fBtext.length$\fR, \fBtext.prefix$\fR
The function is the same as original BibTeX but the unit of operand numbers is Unicode code point.
.TP
\fBchange.case$\fR
The function is the same as original BibTeX but letters of
non-english Latin, Greek and Cyrillic are supported.
.TP
\fBwidth$\fR
The function is the same as original BibTeX but letters of Latin-1 and Latin Extended-A
and CJK characters are supported.
.TP
\fBis.cjk.str$\fR
Pops the top (string) literal, set flag bits to an integer if CJK characters are found in the string,
and pushes the resulting integer, otherwise pushes 0.
Flags 0x001, 0x002, 0x004, 0x008 and 0x800 are corresponding to Hanzi (Kanji, Hanja), Kana, Hangul,
Bopomofo and other CJK characters, respectively.
For example, an integer 0x003 will be pushed if Hanzi and Kana characters are found in a poped string literal.
.TP
\fBis.kanji.str$\fR
Same as \fBis.cjk.str$\fR for compatibility with (u)pBibTeX.
.SH SEE ALSO
More detailed description of
.B BibTeXu
is available at $TEXMFDIST/doc/bibtexu/README.
.SH AUTHORS
.B BibTeXu 
was written by Yannis Haralambous and his students.
It is maintained as part of TeX Live.
.PP
This manpage was written for TeX Live.
